fix: prevent path traversal via user_id in playground Flask app (CWE-22)#68
Open
sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
Open
fix: prevent path traversal via user_id in playground Flask app (CWE-22)#68sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
sebastiondev wants to merge 1 commit intoBAI-LAB:mainfrom
Conversation
…-22) Add _is_safe_id() validation to the /init_memory endpoint to reject user_id values containing path separators, ".." components, null bytes, or other characters that could escape the intended data directory. Without this check, an attacker can POST user_id="../../etc/cron.d" to create arbitrary directories (via os.makedirs in the Memoryos constructor) or delete arbitrary directory trees (via shutil.rmtree in /clear_memory). The fix uses an allowlist regex (alphanumeric, hyphens, underscores, dots) applied at the HTTP boundary before the value reaches any filesystem operation.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The
/init_memoryendpoint inmemoryos-playground/memdemo/app.pyaccepts auser_idfrom the JSON request body and passes it directly intoos.path.join(data_storage_path, "users", user_id)(and into several derived file paths forshort_term.json,mid_term.json,long_term_user.json, and the per-assistant directory). Because the value is never validated, a request withuser_id="../something"escapes the intendedusers/directory and causes Memoryos to create/overwrite files at attacker-controlled locations on disk.memoryos-playground/memdemo/app.py—init_memory()request.get_json()['user_id']→Memoryos(user_id=…, data_storage_path=…)→os.path.join(data_storage_path, "users", user_id)and subsequent file opens.The demo app binds
0.0.0.0:5019withdebug=Trueand has no authentication on this endpoint, so any host that can reach the port can trigger the sink.Fix
Added an
_is_safe_id()helper and called it at the top ofinit_memory()before the value is used to construct any path. The helper:^[A-Za-z0-9][A-Za-z0-9._-]*$(alphanumerics, dot, hyphen, underscore; must start with an alphanumeric).\x00),/,\, and any./..component.If validation fails the endpoint returns
400with a clear error message; nothing else in the request handler changes. The check sits in front of the only placeuser_identers the system from the network — other endpoints look up an already-initialised entry inmemory_systemskeyed by session, so there is no parallel unfixed sink to worry about.Tests
Added
tests/test_cwe22_app_flask.py, which uses Flask's test client and a stubMemoryosto exercise/init_memorydirectly without pulling in the ML stack. It asserts that:../etc/cron.d,..\windows\system32,foo/../../etc/passwd,foo/../../../tmp/evil,....//....//etc,/absolute/path,.,...alice,bob_123,user-2024,CamelCaseUser.All 12 cases pass locally.
Why this is exploitable
The endpoint is reachable without authentication, the input is taken verbatim from JSON, and
os.path.joindoes not collapse..segments — sousers/../../../tmp/evil/short_term.jsonresolves outsidedata_storage_path. Memoryos then creates the parent directory (os.makedirs(..., exist_ok=True)) and writes JSON files there, giving an attacker arbitrary file creation/overwrite within whatever the demo process can write to. Withdebug=Trueand a network-bound listener this is realistic for anyone who runs the playground on a shared network or container.Before submitting, we tried to disprove this: we checked whether the surrounding framework, an upstream proxy, or the
Memoryosconstructor itself normalises or rejects traversal inuser_id. None of them do —Memoryoshappily uses whatever path it is given, and there is no auth/middleware in front of/init_memory. The validation has to live at the request handler, which is what this PR adds.cc @lewiswigmore